Efficient Barrier Using Remote Memory Operations on VIA-Based Clusters
Abstract
Most high-performance scientific applications require efficient support for collective communication. Point-to-point message-passing communication in current-generation clusters is based on the Send/Recv communication model. Collective communication operations built on top of such point-to-point message-passing operations may achieve suboptimal performance. VIA and the emerging InfiniBand architecture support remote DMA (RDMA) operations, which allow data to be moved between nodes with low overhead. They also make it possible to provide a logical shared-memory address space across the nodes. In this paper, we focus on barrier, one of the most frequently used collective operations. We demonstrate how RDMA write operations can be used to support inter-node barrier in a cluster with SMP nodes. Combining this with a scheme that exploits shared memory within an SMP node, we develop a fast barrier algorithm for clusters of SMP nodes with a cLAN VIA interconnect. Compared to current barrier algorithms that use the Send/Recv communication model, the new approach is shown to reduce barrier latency on a 64-processor (32 dual-processor nodes) system by up to 66%. These results demonstrate that high-performance and scalable barrier implementations can be delivered on current and next-generation VIA/InfiniBand-based clusters with RDMA support.
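The two-level idea in the abstract (shared memory inside an SMP node, remote writes between nodes) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's algorithm: threads stand in for processes, a Python list stands in for RDMA-registered memory, the inter-node phase is assumed to follow a dissemination pattern (the abstract does not say which pattern the authors use), and the names `Node`, `remote_write`, `barrier`, and `run_demo` are hypothetical.

```python
import threading

class Node:
    """One simulated SMP node: local arrival state plus flag slots peers write into."""
    def __init__(self, procs, rounds):
        self.procs = procs
        self.cv = threading.Condition()  # guards every field below
        self.arrived = 0                 # local processes that have reached the barrier
        self.released = 0                # last barrier epoch released to local processes
        self.inbox = [0] * rounds        # one slot per round, written by a remote node

def remote_write(target, slot, value):
    # Stand-in for an RDMA write (assumption): store directly into the target's memory.
    with target.cv:
        target.inbox[slot] = value
        target.cv.notify_all()

def barrier(nodes, node_id, local_rank, epoch):
    me = nodes[node_id]
    # Phase 1: intra-node gather through shared memory.
    with me.cv:
        me.arrived += 1
        me.cv.notify_all()
        if local_rank != 0:
            # Non-leaders wait until the node leader releases this epoch.
            me.cv.wait_for(lambda: me.released >= epoch)
            return
        me.cv.wait_for(lambda: me.arrived == me.procs)
        me.arrived = 0
    # Phase 2: inter-node dissemination among node leaders.
    # In round k, write a flag to the node 2**k ahead, then wait for our own flag.
    for k in range(len(me.inbox)):
        peer = nodes[(node_id + (1 << k)) % len(nodes)]
        remote_write(peer, k, epoch)
        with me.cv:
            me.cv.wait_for(lambda: me.inbox[k] >= epoch)
    # Phase 3: release the local processes.
    with me.cv:
        me.released = epoch
        me.cv.notify_all()

def run_demo(num_nodes, procs_per_node, epochs):
    """Run several barrier epochs and return a list of ordering violations."""
    rounds = (num_nodes - 1).bit_length()  # ceil(log2(num_nodes))
    nodes = [Node(procs_per_node, rounds) for _ in range(num_nodes)]
    total = num_nodes * procs_per_node
    lock = threading.Lock()
    entered = [0]      # total number of barrier entries so far
    violations = []

    def worker(node_id, local_rank):
        for epoch in range(1, epochs + 1):
            with lock:
                entered[0] += 1
            barrier(nodes, node_id, local_rank, epoch)
            with lock:
                # Leaving epoch e is only legal once every process has entered it.
                if entered[0] < epoch * total:
                    violations.append((node_id, local_rank, epoch))

    threads = [threading.Thread(target=worker, args=(n, r))
               for n in range(num_nodes) for r in range(procs_per_node)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return violations

if __name__ == "__main__":
    print(run_demo(num_nodes=4, procs_per_node=2, epochs=5))
```

Under this assumed pattern, the inter-node phase needs ceil(log2 N) rounds for N nodes, so a 32-node configuration like the one in the abstract would take 5 rounds. Each flag slot stores a monotonically increasing epoch number rather than a boolean, which lets the slots be reused across barriers without a separate reset step.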
Similar resources
Fast and Scalable Barrier Using RDMA and Multicast Mechanisms for InfiniBand-Based Clusters
This paper describes a methodology for efficiently implementing the collective operations, in this case the barrier, on clusters with the emerging InfiniBand Architecture (IBA). IBA provides hardware level support for the Remote Direct Memory Access (RDMA) message passing model as well as the multicast operation. Exploiting these features of InfiniBand to efficiently implement the barrier opera...
Efficient Collective Operations Using Remote Memory Operations on VIA-Based Clusters
High performance scientific applications require efficient and fast collective communication operations. Most collective communication operations have been built on top of point-to-point send/receive primitives. Modern user-level protocols such as VIA and the emerging InfiniBand architecture support remote DMA operations. These operations not only allow data to be moved between the nodes with l...
Fast Collective Operations Using Shared and Remote Memory Access Protocols on Clusters
This paper describes a novel methodology for implementing a common set of collective communication operations on clusters based on symmetric multiprocessor (SMP) nodes. Called Shared-Remote-Memory collectives, or SRM, our approach replaces the point-to-point message passing, traditionally used in implementation of collective message-passing operations, with a combination of shared and remote me...
Protocols and Strategies for Optimizing Performance of Remote Memory Operations on Clusters
The paper describes a software architecture for supporting remote memory operations on clusters equipped with high-performance networks such as Myrinet and Giganet/Emulex cLAN. It presents protocols and strategies that bridge the gap between user-level API requirements and low-level network-specific interfaces such as GM and VIA. In particular, the issues of memory registration, management of netw...
Efficient Support for Multicomputing on ATM Networks
The emergence of a new generation of networks will dramatically increase the attractiveness of loosely-coupled multicomputers based on workstation clusters. The key to achieving high performance in this environment is efficient network access, because the cost of remote access dictates the granularity of parallelism that can be supported. Thus, in addition to traditional distribution mechanisms...